Search AI Products and News

Explore worldwide AI information, discover new AI opportunities

2025-04-02 10:34:18.AIbase

MiniMax Audio Launches Speech-02 Voice Model: Supports 200,000 Characters at Once

MiniMax Audio, a leading innovator in audio technology, has officially released its new Speech-02 series voice model. Supporting over 30 languages and capable of processing 200,000 characters at once, it delivers a more natural, fluent, and convenient audio experience. The new Speech-02 series is the core highlight of this update. According to the official introduction, this series has significantly improved multilingual support, enabling more accurate and native-sounding pronunciations in various languages. Even more impressively, Speech-

2025-03-20 10:26:58.AIbase

Orpheus TTS: A Next-Generation TTS Model with Human-like Emotional Expression

On March 19th, an open-source text-to-speech (TTS) model called Orpheus TTS was officially launched. This model has quickly gained attention for its human-like emotional expression, natural and fluent voice quality, and ultra-low latency real-time output stream. Orpheus TTS reportedly excels in real-time conversational scenarios and promises to bring new breakthroughs to intelligent voice interaction. Orpheus TTS focuses on low latency and high emotional expression, with core features including: - **Ultra-Low Latency**: Default latency approximately 2

2025-03-03 11:37:51.AIbase

Sesame Releases CSM Voice Model: Transcending the Uncanny Valley with Globally Stunning Realism

Sesame's newly released Conversational Speech Model (CSM) has sparked heated discussion on X, where it has been lauded as a voice model that sounds "just like a real person." Users report that its naturalness and emotional expressiveness make it hard to distinguish from human speech, and it is claimed to have overcome the uncanny valley effect in voice technology. As demonstration videos and user feedback spread, CSM is rapidly emerging as a leader in AI voice technology.

2025-01-17 10:36:59.AIbase

The Overseas Version of Hailuo AI Releases New Voice Model T2A-01-HD with Higher Quality Audio Generation

2025-01-15 11:00:02.AIbase

iFLYTEK Starfire Simultaneous Translation Voice Model Released: Achieving Human Expert Translator Level

Today, iFLYTEK officially launched its latest research and development achievement, the Starfire simultaneous translation voice model, which it describes as the first domestic large model with end-to-end simultaneous speech translation capabilities. Compared to iFLYTEK's previous translation technologies, the model significantly improves translation performance across all scenarios and greatly shortens the end-to-end response time.

2024-12-26 10:54:51.AIbase

China Telecom's Xingchen Model Selected as One of the 'National Treasures' of the Year

In the 'Top Ten National Treasures' annual selection initiated by the State-owned Assets Supervision and Administration Commission of the State Council, China Telecom's self-developed Xingchen Model made the list for its technological achievements. Described as the country's first full-size, full-modal, domestically produced foundation model system, the Xingchen Model covers semantics, speech, vision, and multimodal tasks. In the semantic domain, the model was trained on a fully domestic 10,000-card cluster and training framework, and it reportedly reaches over 93% of the computational efficiency of NVIDIA hardware with equivalent compute power, along with an improved training duration ratio.

2024-11-01 10:51:03.AIbase

SAG-AFTRA and AI Startup Ethovox Reach Agreement to Protect Actors' Voice Rights

The Screen Actors Guild-American Federation of Television and Radio Artists (SAG-AFTRA) has recently signed an agreement with the AI company Ethovox to protect actors' rights in artificial intelligence voice models. The primary goal of this agreement is to establish a responsible framework for the use of AI in the entertainment industry, ensuring that actors' rights are upheld in the new technological era. Under this agreement, actors will receive upfront payments when recording their voices and will share in subsequent revenues. This means that actors will be compensated fairly when their voices are used in AI models.

2024-11-01 10:20:19.AIbase

SAG-AFTRA and AI Company Reach Historic Agreement: Establishing Protection Standards for Actor Voice Data

The Screen Actors Guild-American Federation of Television and Radio Artists (SAG-AFTRA) has recently signed a landmark agreement with the AI voice company Ethovox, setting clear regulations for the AI application of actor voice data. This agreement not only provides a reliable framework for the entertainment industry’s use of AI technology but also establishes new standards for protecting actors' rights. According to the agreement, actors will receive dual revenue guarantees: an upfront payment for voice recordings and ongoing revenue sharing from the application of AI voice models. More importantly, any use of voice data requires consent.

2024-09-24 14:50:52.AIbase

Research Finds Large Language Models Excel in Inductive Reasoning but Face Challenges in Deductive Tasks

Recently, researchers from UCLA and Amazon conducted an in-depth analysis of the reasoning capabilities of large language models (LLMs). They systematically differentiated between inductive and deductive reasoning for the first time, exploring the degree of challenge each poses to AI systems. Inductive reasoning refers to deriving general principles from specific observations, while deductive reasoning involves applying general rules to specific cases. The aim of the study is to understand which reasoning capability presents more challenges for large language models.

2024-09-19 09:02:23.AIbase

Giant Network Launches Self-Developed Character Model GiantGPT and Voice Model BaiLing-TTS

At the opening ceremony of the 2024 Yunqi Conference, Giant Network made its debut and showcased its latest achievements in the "Game + AI" field. The company launched two self-developed large model applications, GiantGPT and BaiLing-TTS, and also demonstrated new technologies such as AI Digital Humans and the AI painting platform Giant Mimicry.

2024-09-05 11:58:52.AIbase

Soul Voice Model Major Upgrade: Real-time End-to-End Voice Calls, Difficult to Distinguish Between Real Humans and AI Virtual Characters!

Soul App has upgraded its self-developed end-to-end, full-duplex voice call model, a significant step in the AI+Social field. The upgrade delivers natural, smooth voice conversations with virtual characters that come close to the realism of talking to a real person. The model features ultra-low interaction latency, rapid automatic interruption, hyper-realistic voice expression, and emotion perception and understanding, making it nearly impossible for users to tell whether their conversation partner is a real person or an AI. The upgrade enhances the social experience for Soul users and showcases the company's progress in AI technology.

2024-08-22 08:34:49.AIbase

ByteDance's Doubao Voice Model and Visual Model Upgraded, Overall Capability Increased by 20.3%

At the Volcano Engine AI Innovation Roadshow in Shanghai on August 21, 2024, Volcano Engine showcased a comprehensive upgrade of the Doubao Large Model. This includes the Doubao Text-to-Image Model, which has improved text and image matching capabilities for long texts, the Doubao Speech Recognition Model, which reduced error rates by up to 40% across multiple public test sets, and upgrades to the Doubao Speech Synthesis Model, enhancing streaming speech synthesis abilities for real-time responses and accurate punctuation. Volcano Engine also released a real-time interactive solution for Conversational AI, integrating the Doubao Large Model with real-time audio and video technology, providing end-to-end capabilities.

2024-08-10 11:48:08.AIbase

Alibaba Releases New Voice Model Qwen2-Audio, Surpassing OpenAI Whisper

Alibaba recently launched the new open-source voice model Qwen2-Audio, which excels in speech recognition, translation, and audio analysis, achieving significant performance improvements. Qwen2-Audio offers a basic version and an instruction fine-tuning version, supporting multiple languages such as Chinese, Cantonese, French, English, and Japanese, facilitating sentiment analysis and translation applications. Compared to Qwen-Audio, Qwen2-Audio features comprehensive optimizations in architecture and performance, utilizing more natural language prompts during the pre-training phase.

2024-08-04 15:52:17.AIbase

Xiaomi's Voice Model First Implementation: SU7's External Wake-Up Defense Feature Now Live

Xiaomi's SU7 has received a system upgrade to HyperOS 1.2.7, introducing key features such as external wake-up defense and front vehicle recognition. The highly anticipated 'external wake-up defense' feature activates automatically when the vehicle is in Park (P), the central control is locked, and all doors and windows are closed. It blocks voice wake-up attempts from outside the vehicle, significantly enhancing security. The improvement directly addresses user feedback about the safety risks of external voice wake-ups.

2024-07-25 14:24:32.AIbase

Comparable to GPT-4o! Fudan Introduces SpeechGPT2, a Voice Model that Can Understand Your Emotions

Fudan University has introduced SpeechGPT, a large language model designed to understand and generate both speech and text. By discretizing speech signals, it enables compatibility with text modalities, allowing for emotional perception and multi-style speech generation based on context and instructions. The training strategy includes modal adaptation pre-training, cross-modal instruction fine-tuning, and modal chain instruction fine-tuning to en....

